WhyWhere is a new ecological niche modeling (ENM) algorithm for mapping and explaining the distribution of species. The algorithm uses image processing methods to sift efficiently through large amounts of data and find the few variables that best predict species occurrence. The purpose of this paper is to describe and justify the main parameterizations and to show preliminary success at rapidly providing accurate, scalable, and simple ENMs. Preliminary results for six species of plants and animals in different regions indicate a significant (p < 0.01) 14% increase in accuracy over the GARP algorithm, using models with few (typically two) variables. The increase is attributed to access to additional data, particularly monthly rather than annual climate averages. WhyWhere is also 6 times faster than GARP on large data sets. A data-mining-based approach with transparent access to remote data archives is a new paradigm for ENM, particularly suited to finding correlates in large databases of fine-resolution surfaces. Software for WhyWhere is freely available, both as a service and in a downloadable desktop form, from the web site http://biodi.sdsc.edu/ww_home.html.
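The idea of sifting through many candidate variables to find the few that best predict occurrence can be illustrated with a minimal sketch. This is not the WhyWhere implementation: equal-width binning stands in for the image processing step, the balanced-accuracy score is an assumed choice of metric, and all function names, variable names, and data values below are hypothetical.

```python
from collections import Counter

def bin_values(values, n_bins):
    """Quantize continuous values into n_bins equal-width categories."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant layer
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def score_variable(values, presence_idx, background_idx, n_bins=4):
    """Balanced accuracy of a rule that predicts presence in every bin
    over-represented among presence cells relative to background cells."""
    bins = bin_values(values, n_bins)
    p = Counter(bins[i] for i in presence_idx)
    b = Counter(bins[i] for i in background_idx)
    n_p, n_b = len(presence_idx), len(background_idx)
    suitable = {k for k in range(n_bins) if p[k] / n_p > b[k] / n_b}
    hits = sum(bins[i] in suitable for i in presence_idx)
    rejections = sum(bins[i] not in suitable for i in background_idx)
    return 0.5 * (hits / n_p + rejections / n_b)

def rank_variables(layers, presence_idx, background_idx, n_bins=4):
    """Score every candidate layer and return (name, score), best first."""
    scores = {
        name: score_variable(vals, presence_idx, background_idx, n_bins)
        for name, vals in layers.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: eight grid cells, two candidate layers.
layers = {
    "july_temp": [10, 11, 12, 13, 30, 31, 32, 33],
    "noise": [1, 5, 1, 5, 1, 5, 1, 5],
}
presence_idx = [4, 5, 6, 7]    # cells with recorded occurrences (assumed)
background_idx = [0, 1, 2, 3]  # cells sampled as background (assumed)

ranked = rank_variables(layers, presence_idx, background_idx)
for name, score in ranked:
    print(name, score)
```

In this toy case the monthly temperature layer separates presence from background cells and ranks first, while the uninformative layer scores at chance level. A search over pairs of variables would follow the same pattern, scoring each combination of binned layers.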
Lifemapper (http://www.lifemapper.org) is a predictive electronic atlas of the Earth's biodiversity. Using a screensaver version of the GARP genetic algorithm for modeling species distributions, Lifemapper harnesses vast computing resources through volunteers' PCs, in a manner similar to SETI@home, to develop models of the distribution of the world's fauna and flora. The Lifemapper project's primary goal is to provide an up-to-date and comprehensive database of species maps and prediction models (i.e., a fauna and flora of the world) using available data on species' locations. The models are developed using specimen data from distributed museum collections and an archive of geospatial environmental correlates. A central server maintains a dynamic archive of species maps and models for research, outreach to the general community, and feedback to museum data providers. This paper is a case study in the role, use, and justification of a genetic algorithm in the development of large-scale environmental informatics infrastructure.